Reinforcement Causal Structure Learning on Order Graph
Authors
Abstract
Learning a directed acyclic graph (DAG) that describes the causality of observed data is a very challenging but important task. Due to the limited quantity and quality of observed data, and the non-identifiability of the causal graph, it is almost impossible to infer a single precise DAG. Some methods approximate the posterior distribution of DAGs and explore the DAG space via Markov chain Monte Carlo (MCMC), but because the DAG space grows super-exponentially, accurately characterizing the whole distribution over DAGs is intractable. In this paper, we propose Reinforcement Causal Structure Learning on Order Graph (RCL-OG), which uses an order graph instead of MCMC to model different DAG topological orderings and to reduce the problem size. RCL-OG first defines reinforcement learning with a new reward mechanism to approximate the posterior distribution of orderings efficiently, and uses deep Q-learning to update and transfer rewards between nodes. Next, it obtains the probability transition model of nodes on the order graph and computes the posterior probability of different orderings. In this way, we can sample on this model to obtain orderings with high probability. Experiments on synthetic and benchmark datasets show that RCL-OG provides an accurate posterior probability approximation and achieves better results than competitive causal discovery algorithms.
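The idea of walking an order graph to obtain orderings can be illustrated with a small sketch. It assumes the usual construction in which each node of the order graph is a subset of already-placed variables and each edge adds one variable, so a path from the empty set to the full set spells out one topological ordering. The function names, the softmax over scores, and `toy_score` (a stand-in for learned Q-values) are hypothetical choices for illustration and are not taken from the paper.

```python
import itertools
import math
import random


def order_graph_states(variables):
    """Enumerate order-graph nodes: every subset of the variables."""
    states = []
    for size in range(len(variables) + 1):
        for combo in itertools.combinations(variables, size):
            states.append(frozenset(combo))
    return states


def transition_probs(state, variables, score):
    """Probability of appending each remaining variable to the ordering.

    Assumption: a softmax over score(state, v); the paper's actual
    transition model may be defined differently.
    """
    remaining = [v for v in variables if v not in state]
    weights = [math.exp(score(state, v)) for v in remaining]
    total = sum(weights)
    return {v: w / total for v, w in zip(remaining, weights)}


def ordering_probability(ordering, variables, score):
    """Probability of one ordering: product of transitions along its path."""
    state = frozenset()
    prob = 1.0
    for v in ordering:
        prob *= transition_probs(state, variables, score)[v]
        state = state | {v}
    return prob


def sample_ordering(variables, score, rng):
    """Sample an ordering by walking from the empty set to the full set."""
    state = frozenset()
    ordering = []
    while len(state) < len(variables):
        probs = transition_probs(state, variables, score)
        candidates = list(probs)
        v = rng.choices(candidates, weights=[probs[c] for c in candidates])[0]
        ordering.append(v)
        state = state | {v}
    return ordering


def toy_score(state, v):
    """Hypothetical stand-in for a learned Q-value; ignores the state."""
    return {"A": 2.0, "B": 1.0, "C": 0.0}[v]


variables = ["A", "B", "C"]
rng = random.Random(0)
print(len(order_graph_states(variables)), "order-graph nodes")
print(sample_ordering(variables, toy_score, rng))
print(ordering_probability(("A", "B", "C"), variables, toy_score))
```

With n variables the order graph has 2^n nodes, whereas the number of orderings is n! and the number of DAGs grows super-exponentially, which is the sense in which working on the order graph reduces the problem size. Because each step normalizes over the remaining variables, the probabilities of all orderings sum to one.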
Similar resources
Order-independent constraint-based causal structure learning
We consider constraint-based methods for causal structure learning, such as the PC-, FCI-, RFCI- and CCD-algorithms (Spirtes et al. (2000, 1993), Richardson (1996), Colombo et al. (2012), Claassen et al. (2013)). The first step of all these algorithms consists of the PC-algorithm. This algorithm is known to be order-dependent, in the sense that the output can depend on the order in which the variab...
Reinforcement learning and causal models
This chapter reviews the diverse roles that causal knowledge plays in reinforcement learning. The first half of the chapter contrasts a “model-free” system that learns to repeat actions that lead to reward with a “model-based” system that learns a probabilistic causal model of the environment which it then uses to plan action sequences. Evidence suggests that these two systems coexist in the br...
Learning Higher-Order Graph Structure with Features by Structure Penalty
In discrete undirected graphical models, the conditional independence of node labels Y is specified by the graph structure. We study the case where there is another input random vector X (e.g. observed features) such that the distribution P(Y | X) is determined by functions of X that characterize the (higher-order) interactions among the Y's. The main contribution of this paper is to learn th...
Abolishing the effect of reinforcement delay on human causal learning.
Associative learning theory postulates two main determinants for human causal learning: contingency and contiguity. In line with such an account, participants in Shanks, Pearson, and Dickinson (1989) failed to discover causal relations involving delays of more than two seconds. More recent research has shown that the impact of contiguity and delay is mediated by prior knowledge about the timefr...
Partial Order Hierarchical Reinforcement Learning
In this paper the notion of a partial-order plan is extended to task-hierarchies. We introduce the concept of a partial-order task-hierarchy that decomposes a problem using multi-tasking actions. We go further and show how a problem can be automatically decomposed into a partial-order task-hierarchy, and solved using hierarchical reinforcement learning. The problem structure determines the reduc...
Journal
Journal title: Proceedings of the ... AAAI Conference on Artificial Intelligence
Year: 2023
ISSN: 2159-5399, 2374-3468
DOI: https://doi.org/10.1609/aaai.v37i9.26274